the other day i was bored and shell scripting, so i made a list of all the web servers running on the listed drupal sites. the results:

    130 Apache/1.3.27
     53 Apache/1.3.28
     20 Apache/1.3.26
     20 Apache
     13 Apache/1.3.20
     12 Apache/1.3.19
      7 Apache/2.0.47
      4 Apache/1.3.6
      3 Apache-AdvancedExtranetServer/1.3.26
      2 Apache/2.0.40
      2 Apache/1.3.24
      1 Apache-AdvancedExtranetServer/2.0.44
      1 Apache/2.0.46
      1 Apache/1.3.23
     

question: does anybody know how many drupal sites there are in the wild? maybe we could make an educated guess from the downloads?

Comments

dries’s picture

Interesting. What is the number in front of the webserver? It is not the number of Drupal sites, is it?

For what it is worth: Drupal 4.2.0 was downloaded exactly 5,007 times in August.

Also in August, the Drupal website itself had 782,794 hits (83,439 unique visits). Of these hits, 667,446 were "Code 200 - OK" and 109,948 were "Code 304 - Not Modified". Traffic requirements are between 10 and 11 Gb/month for the website only (excl. mailing list and CVS), all of which is donated by Kjartan!

bertboerland’s picture

being in the hosting biz, i know what bandwidth, housing and hosting cost. so thanks kjartan!

yes, the number in front is the number of hosts running that server. it doesn't sum up to the total number of hosts listed on the drupal.org/sites page; some hosts were down or unreachable.

the code is something like

    lynx -dump http://drupal.org/sites | grep -i -v drupal.org | grep -i http | awk {'print $2'} > ./allsites
    for i in `cat allsites`
    do
      wget -t1 -O- -s $i | grep -i server >> ./allheaders
    done
    more allheaders | grep -i server: | awk -F" " '{print $2'} | sort | uniq -c | sort -nr

okay, now everybody knows i shouldn't contribute to the drupal code :-)

--
groets

bertb

--
groets
bert boerland

killes@www.drop.org’s picture

I think the curly braces in your awk statements are not right. awk likes '{foo}' not {'foo'}.
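
To illustrate the quoting point: the shell strips the quotes and glues the adjacent pieces together, so awk {'print $2'} and awk '{print $2}' both hand awk the same program text, {print $2}. The odd form works, but the conventional one is easier to read. Below is a minimal sketch of the tally step with conventional quoting. count_servers is a made-up helper name, and the tr step that strips carriage returns is an addition, not part of the original script.

```shell
# Tally "Server:" header values read from stdin, most common first.
# count_servers is a hypothetical helper, not from the original post.
count_servers() {
  grep -i '^server:' |   # keep only the Server: header lines
    tr -d '\r' |         # HTTP headers end in CRLF; drop the CR (assumed addition)
    awk '{print $2}' |   # second field is the server token
    sort | uniq -c | sort -nr
}

# Both quoting styles pass awk the identical program {print $2}:
echo 'Server: Apache/1.3.27' | awk {'print $2'}
echo 'Server: Apache/1.3.27' | awk '{print $2}'
```

Usage would be something like: count_servers < allheaders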

Kjartan’s picture

Note this is for August 2003

CVS usage is up to 4 Gb. On average, Drupal-related CVS gets hit 150 times a day. The contributions repository actually accounts for most of the traffic, but this is probably because that repo is 3.36 times the size of the main Drupal repository. If just 2 people do a checkout of drupal-contrib a day, that's 2 gigs of traffic a month.

All mail related traffic on the drupal.org domain is 2.5 Gb a month. This is mostly mailing lists, but it also includes notification mails sent by drupal.org and the select few who have @drupal.org mail addresses. drupal-devel is the most active mailing list, having 5 times the volume of drupal-support.
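
As a rough back-of-the-envelope check, the checkout figure above implies a size per checkout. This sketch assumes a 30-day month and 1 gig = 1024 MB, neither of which is stated in the numbers above, and per_checkout_mb is a made-up helper name.

```shell
# Implied traffic per checkout, in MB.
# Assumptions (not from the post): 30-day month, 1 GB = 1024 MB.
# $1 = monthly traffic in GB, $2 = checkouts per day
per_checkout_mb() {
  awk -v gb="$1" -v per_day="$2" \
    'BEGIN { printf "%.1f\n", gb * 1024 / (per_day * 30) }'
}

# 2 gigs a month from 2 checkouts a day:
per_checkout_mb 2 2
```

Under those assumptions, each drupal-contrib checkout would come to roughly 34 MB.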

All in all, Drupal related stuff requires 15 Gb of traffic a month, and it's growing nicely every month.

Then again, Drupal only accounts for 30% of my bandwidth usage a month, so for now it's doable.

--
Kjartan